lay summary
Leveraging Large Language Models for Zero-shot Lay Summarisation in Biomedicine and Beyond
Goldsack, Tomas, Scarton, Carolina, Lin, Chenghua
In this work, we explore the application of Large Language Models to zero-shot Lay Summarisation. We propose a novel two-stage framework for Lay Summarisation based on real-life processes, and find that summaries generated with this method are increasingly preferred by human judges for larger models. To help establish best practices for employing LLMs in zero-shot settings, we also assess the ability of LLMs as judges, finding that they are able to replicate the preferences of human judges. Finally, we take the initial steps towards Lay Summarisation for Natural Language Processing (NLP) articles, finding that LLMs are able to generalise to this new domain, and further highlighting the greater utility of summaries generated by our proposed approach via an in-depth human evaluation.
- North America > Canada > Ontario > Toronto (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- Asia > Singapore (0.04)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
Overview of the BioLaySumm 2024 Shared Task on the Lay Summarization of Biomedical Research Articles
Goldsack, Tomas, Scarton, Carolina, Shardlow, Matthew, Lin, Chenghua
This paper presents the setup and results of the second edition of the BioLaySumm shared task on the Lay Summarisation of Biomedical Research Articles, hosted at the BioNLP Workshop at ACL 2024. In this task edition, we aim to build on the first edition's success by further increasing research interest in this important task and encouraging participants to explore novel approaches that will help advance the state-of-the-art. Encouragingly, we found research interest in the task to be high, with this edition of the task attracting a total of 53 participating teams, a significant increase in engagement from the previous edition. Overall, our results show that a broad range of innovative approaches were adopted by task participants, with a predictable shift towards the use of Large Language Models (LLMs).
- Asia > Thailand > Bangkok > Bangkok (0.06)
- North America > Canada > Ontario > Toronto (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- (5 more...)
- Research Report > Promising Solution (0.68)
- Research Report > New Finding (0.68)
- Overview > Innovation (0.54)
From Complexity to Clarity: How AI Enhances Perceptions of Scientists and the Public's Understanding of Science
This paper evaluated the effectiveness of using generative AI to simplify science communication and enhance the public's understanding of science. By comparing lay summaries of journal articles from PNAS, yoked to those generated by AI, this work first assessed linguistic simplicity across such summaries and public perceptions. Study 1a analyzed simplicity features of PNAS abstracts (scientific summaries) and significance statements (lay summaries), observing that lay summaries were indeed linguistically simpler, but effect size differences were small. Study 1b used a large language model, GPT-4, to create significance statements based on paper abstracts and this more than doubled the average effect size without fine-tuning. Study 2 experimentally demonstrated that simply-written GPT summaries facilitated more favorable perceptions of scientists (they were perceived as more credible and trustworthy, but less intelligent) than more complexly-written human PNAS summaries. Crucially, Study 3 experimentally demonstrated that participants comprehended scientific writing better after reading simple GPT summaries compared to complex PNAS summaries. In their own words, participants also summarized scientific papers in a more detailed and concrete manner after reading GPT summaries compared to PNAS summaries of the same article. AI has the potential to engage scientific communities and the public via a simple language heuristic, advocating for its integration into scientific dissemination for a more informed society.
- Asia > Middle East > Jordan (0.04)
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > Michigan > Ingham County > Lansing (0.04)
- (2 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Government (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (0.68)
- Health & Medicine > Therapeutic Area > Neurology (0.67)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.47)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.88)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.49)
ATLAS: Improving Lay Summarisation with Attribute-based Control
Zhang, Zhihao, Goldsack, Tomas, Scarton, Carolina, Lin, Chenghua
Lay summarisation aims to produce summaries of scientific articles that are comprehensible to non-expert audiences. However, previous work assumes a one-size-fits-all approach, where the content and style of the produced summary are entirely dependent on the data used to train the model. In practice, audiences with different levels of expertise will have specific needs, impacting what content should appear in a lay summary and how it should be presented. Aiming to address this, we propose ATLAS, a novel abstractive summarisation approach that can control various properties that contribute to the overall "layness" of the generated summary using targeted control attributes. We evaluate ATLAS on a combination of biomedical lay summarisation datasets, where it outperforms state-of-the-art baselines using mainstream summarisation metrics. Additional analyses provided on the discriminatory power and emergent influence of our selected controllable attributes further attest to the effectiveness of our approach.
- North America > Canada > Ontario > Toronto (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
- (9 more...)
WisPerMed at BioLaySumm: Adapting Autoregressive Large Language Models for Lay Summarization of Scientific Articles
Pakull, Tabea M. G., Damm, Hendrik, Idrissi-Yaghir, Ahmad, Schäfer, Henning, Horn, Peter A., Friedrich, Christoph M.
This paper details the efforts of the WisPerMed team in the BioLaySumm2024 Shared Task on automatic lay summarization in the biomedical domain, aimed at making scientific publications accessible to non-specialists. Large language models (LLMs), specifically the BioMistral and Llama3 models, were fine-tuned and employed to create lay summaries from complex scientific texts. The summarization performance was enhanced through various approaches, including instruction tuning, few-shot learning, and prompt variations tailored to incorporate specific context information. The experiments demonstrated that fine-tuning generally led to the best performance across most evaluated metrics. Few-shot learning notably improved the models' ability to generate relevant and factually accurate texts, particularly when using a well-crafted prompt. Additionally, a Dynamic Expert Selection (DES) mechanism to optimize the selection of text outputs based on readability and factuality metrics was developed. Out of 54 participants, the WisPerMed team reached the 4th place, measured by readability, factuality, and relevance. Determined by the overall score, our approach improved upon the baseline by approx. 5.5 percentage points and was only approx 1.5 percentage points behind the first place.
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- North America > United States > Tennessee (0.04)
- (6 more...)
The Lay Person's Guide to Biomedicine: Orchestrating Large Language Models
Luo, Zheheng, Xie, Qianqian, Ananiadou, Sophia
Automated lay summarisation (LS) aims to simplify complex technical documents into a more accessible format to non-experts. Existing approaches using pre-trained language models, possibly augmented with external background knowledge, tend to struggle with effective simplification and explanation. Moreover, automated methods that can effectively assess the `layness' of generated summaries are lacking. Recently, large language models (LLMs) have demonstrated a remarkable capacity for text simplification, background information generation, and text evaluation. This has motivated our systematic exploration into using LLMs to generate and evaluate lay summaries of biomedical articles. We propose a novel \textit{Explain-then-Summarise} LS framework, which leverages LLMs to generate high-quality background knowledge to improve supervised LS. We also evaluate the performance of LLMs for zero-shot LS and propose two novel LLM-based LS evaluation metrics, which assess layness from multiple perspectives. Finally, we conduct a human assessment of generated lay summaries. Our experiments reveal that LLM-generated background information can support improved supervised LS. Furthermore, our novel zero-shot LS evaluation metric demonstrates a high degree of alignment with human preferences. We conclude that LLMs have an important part to play in improving both the performance and evaluation of LS methods.
Making Science Simple: Corpora for the Lay Summarisation of Scientific Literature
Goldsack, Tomas, Zhang, Zhihao, Lin, Chenghua, Scarton, Carolina
Lay summarisation aims to jointly summarise and simplify a given text, thus making its content more comprehensible to non-experts. Automatic approaches for lay summarisation can provide significant value in broadening access to scientific literature, enabling a greater degree of both interdisciplinary knowledge sharing and public understanding when it comes to research findings. However, current corpora for this task are limited in their size and scope, hindering the development of broadly applicable data-driven approaches. Aiming to rectify these issues, we present two novel lay summarisation datasets, PLOS (large-scale) and eLife (medium-scale), each of which contains biomedical journal articles alongside expert-written lay summaries. We provide a thorough characterisation of our lay summaries, highlighting differing levels of readability and abstractiveness between datasets that can be leveraged to support the needs of different applications. Finally, we benchmark our datasets using mainstream summarisation approaches and perform a manual evaluation with domain experts, demonstrating their utility and casting light on the key challenges of this task.
Enhancing Biomedical Lay Summarisation with External Knowledge Graphs
Goldsack, Tomas, Zhang, Zhihao, Tang, Chen, Scarton, Carolina, Lin, Chenghua
Previous approaches for automatic lay summarisation are exclusively reliant on the source article that, given it is written for a technical audience (e.g., researchers), is unlikely to explicitly define all technical concepts or state all of the background information that is relevant for a lay audience. We address this issue by augmenting eLife, an existing biomedical lay summarisation dataset, with article-specific knowledge graphs, each containing detailed information on relevant biomedical concepts. Using both automatic and human evaluations, we systematically investigate the effectiveness of three different approaches for incorporating knowledge graphs within lay summarisation models, with each method targeting a distinct area of the encoder-decoder model architecture. Our results confirm that integrating graph-based domain knowledge can significantly benefit lay summarisation by substantially increasing the readability of generated text and improving the explanation of technical concepts.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
- Asia > China > Hong Kong (0.04)
- (10 more...)
Background Knowledge Grounding for Readable, Relevant, and Factual Biomedical Lay Summaries
Communication of scientific findings to the public is important for keeping non-experts informed of developments such as life-saving medical treatments. However, generating readable lay summaries from scientific documents is challenging, and currently, these summaries suffer from critical factual errors. One popular intervention for improving factuality is using additional external knowledge to provide factual grounding. However, it is unclear how these grounding sources should be retrieved, selected, or integrated, and how supplementary grounding documents might affect the readability or relevance of the generated summaries. We develop a simple method for selecting grounding sources and integrating them with source documents. We then use the BioLaySum summarization dataset to evaluate the effects of different grounding sources on summary quality. We found that grounding source documents improves the relevance and readability of lay summaries but does not improve factuality of lay summaries. This continues to be true in zero-shot summarization settings where we hypothesized that grounding might be even more important for factual lay summaries.
- North America > United States > New York > Kings County > New York City (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > Canada > Nova Scotia > Halifax Regional Municipality > Halifax (0.04)
- Europe > Italy > Tuscany > Florence (0.04)
Lay Text Summarisation Using Natural Language Processing: A Narrative Literature Review
Vinzelberg, Oliver, Jenkins, Mark David, Morison, Gordon, McMinn, David, Tieges, Zoe
Summarisation of research results in plain language is crucial for promoting public understanding of research findings. The use of Natural Language Processing to generate lay summaries has the potential to relieve researchers' workload and bridge the gap between science and society. The aim of this narrative literature review is to describe and compare the different text summarisation approaches used to generate lay summaries. We searched the databases Web of Science, Google Scholar, IEEE Xplore, Association for Computing Machinery Digital Library and arXiv for articles published until 6 May 2022. We included original studies on automatic text summarisation methods to generate lay summaries. We screened 82 articles and included eight relevant papers published between 2020 and 2021, all using the same dataset. The results show that transformer-based methods such as Bidirectional Encoder Representations from Transformers (BERT) and Pre-training with Extracted Gap-sentences for Abstractive Summarization (PEGASUS) dominate the landscape of lay text summarisation, with all but one study using these methods. A combination of extractive and abstractive summarisation methods in a hybrid approach was found to be most effective. Furthermore, pre-processing approaches to input text (e.g. applying extractive summarisation) or determining which sections of a text to include, appear critical. Evaluation metrics such as Recall-Oriented Understudy for Gisting Evaluation (ROUGE) were used, which do not consider readability. To conclude, automatic lay text summarisation is under-explored. Future research should consider long document lay text summarisation, including clinical trial reports, and the development of evaluation metrics that consider readability of the lay summary.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Oceania > Australia > Victoria > Melbourne (0.04)
- Europe > United Kingdom > England > Greater Manchester > Manchester (0.04)
- (4 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)